
Graduate Student: 魏震豪
Thesis Title (Chinese): 基於雙眼立體影片藉由迭代式校正深度圖合成多視角影片
Thesis Title (English): Multi-view video synthesis from stereo videos with iterative depth refinement
Advisor: 賴尚宏
Committee Members: 賴尚宏
莊永裕
杭學鳴
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Publication Year: 2013
Graduation Academic Year: 101
Language: English
Number of Pages: 61
Chinese Keywords: view synthesis, depth map refinement, multi-view
English Keywords: view synthesis, depth refinement, multi-view


    In this thesis, we propose a novel algorithm that refines depth maps and generates multi-view video sequences from two-view video sequences for modern autostereoscopic displays. To generate realistic content for virtual views, high-quality depth maps are critical to the view synthesis results; refining the depth maps is therefore the main challenge in this task. We propose an iterative depth refinement algorithm, comprising error detection and error correction, to correct errors in the depth maps. Error pixels are classified into across-view color-depth-inconsistency errors and local color-depth-inconsistency errors, and are corrected by sampling local candidate pixels. Next, we apply a trilateral filter whose weights combine intensity, spatial, and temporal terms to enhance temporal and spatial consistency across frames. The virtual views can then be synthesized from the refined depth maps. To combine the two warped images, disparity-based view interpolation is introduced to alleviate translucent artifacts, and a directional filter is applied to reduce aliasing around object boundaries. The result is a set of high-quality virtual views between the two input views. Experiments on benchmark image and video datasets demonstrate the superior image quality of virtual views synthesized by the proposed algorithm over state-of-the-art view synthesis methods.
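    The error-detection step above flags depth pixels whose color and depth disagree across the two views. The abstract does not give the exact criterion, so the sketch below substitutes a standard left-right disparity consistency check under a rectified, horizontal-shift camera model; the function name and tolerance are illustrative assumptions, not the thesis's actual method.

    ```python
    import numpy as np

    def lr_consistency_errors(disp_left, disp_right, tol=1.0):
        """Flag left-view pixels whose disparity disagrees with the
        disparity of the corresponding right-view pixel.

        Assumes rectified views: pixel (y, x) in the left image maps
        to (y, x - d) in the right image, where d = disp_left[y, x].
        """
        h, w = disp_left.shape
        ys = np.arange(h)[:, None].repeat(w, axis=1)
        xs = np.arange(w)[None, :].repeat(h, axis=0)
        # Location of each left pixel when warped into the right view.
        xr = np.clip(np.round(xs - disp_left).astype(int), 0, w - 1)
        # Error pixels: left and corresponding right disparities differ.
        return np.abs(disp_left - disp_right[ys, xr]) > tol
    ```

    Pixels flagged by such a check would then be handed to the correction stage, which the abstract describes as resampling from similar local candidates.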

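    The trilateral filter described in the abstract weights neighboring depth samples by three terms: spatial distance, temporal distance, and intensity similarity in the guiding color video. A minimal, brute-force sketch on a single-channel video follows; the window radii, Gaussian sigmas, and single-channel guidance are assumptions for illustration, not the thesis's actual parameters.

    ```python
    import numpy as np

    def trilateral_filter(depth, intensity, t, radius=2, t_radius=1,
                          sigma_s=2.0, sigma_t=1.0, sigma_r=10.0):
        """Smooth frame `t` of a depth video (shape T x H x W) using
        weights that combine spatial, temporal, and intensity terms."""
        T, H, W = depth.shape
        out = np.zeros((H, W))
        for y in range(H):
            for x in range(W):
                acc = norm = 0.0
                for dt in range(-t_radius, t_radius + 1):
                    tt = t + dt
                    if not 0 <= tt < T:
                        continue
                    for dy in range(-radius, radius + 1):
                        for dx in range(-radius, radius + 1):
                            yy, xx = y + dy, x + dx
                            if not (0 <= yy < H and 0 <= xx < W):
                                continue
                            # Spatial, temporal, and intensity weights.
                            w_s = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                            w_t = np.exp(-(dt * dt) / (2 * sigma_t ** 2))
                            diff = intensity[t, y, x] - intensity[tt, yy, xx]
                            w_r = np.exp(-(diff * diff) / (2 * sigma_r ** 2))
                            w = w_s * w_t * w_r
                            acc += w * depth[tt, yy, xx]
                            norm += w
                out[y, x] = acc / norm
        return out
    ```

    The intensity term prevents smoothing across object edges, while the temporal term suppresses frame-to-frame flicker in the refined depth, matching the consistency goals stated in the abstract.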
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Problem Description
      1.3 Main Contribution
      1.4 Thesis Organization
    Chapter 2 Related Work
      2.1 Stereo Matching
      2.2 Depth Refinement
      2.3 View Synthesis
    Chapter 3 Proposed Method
      3.1 Iterative Depth Refinement
        3.1.1 Error Detection
        3.1.2 Error Correction
      3.2 Spatial-Temporal Smoothing
      3.3 View Synthesis
        3.3.1 Image Warping with Artifacts Removal
        3.3.2 Disparity-Based View Interpolation
        3.3.3 Consistent Disocclusion Region Recovering
        3.3.4 Boundary Refinement
    Chapter 4 Experimental Results
      4.1 Datasets
      4.2 Environment Setting
      4.3 Image Evaluation
      4.4 Sequence Evaluation
    Chapter 5 Conclusion
    References


    Full text release date: not authorized for public access (campus network)
    Full text release date: not authorized for public access (off-campus network)
