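Recovering Detail-Preserving Depth Maps from a Video Sequence | National Tsing Hua University Master's and Doctoral Theses Repository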

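Author:	曾聖博 Tseng, Sheng-Po
Thesis title:	從影片重建具有細節的深度圖 Recovering detail-preserving depth maps from a video sequence
Advisor:	賴尚宏 Lai, Shang-Hong
Oral examination committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2010
Academic year of graduation:	98 (ROC calendar)
Language:	English
Number of pages:	45
Keywords (Chinese):	三維重建、影片、深度圖
Keywords (English):	3D reconstruction, video, depth maps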

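In this thesis, we propose a system that builds depth maps from videos of outdoor scenes. Given the characteristics of video, our approach draws on more temporal-domain information than traditional depth reconstruction methods.
First, we find continuous Scale Invariant Feature Transform (SIFT) correspondences across consecutive video frames and use them to perform structure from motion, which yields camera information, including translation and rotation, for every image in the video. Next, we compute constrained optical flow for a set of selected frames, which lets us solve an over-constrained linear system for the predicted depth map of each frame. Mean shift image segmentation is then used to reduce errors and outliers in textureless regions. The initial depth map is built from the predicted depth map, the segmentation result, and other physical constraints. This initial depth map serves as the data term of the Markov Random Field (MRF) used to build the final depth map. By minimizing the MRF energy function of each frame, we turn the initial depth map into a depth result that is visually comfortable, detail-preserving, and continuous in the temporal domain.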
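As a rough illustration of the over-constrained linear system mentioned in the abstract, the sketch below recovers the depth of a single reference pixel from several views by least squares. It assumes calibrated (normalized) image coordinates and camera poses already known from structure from motion. The function name `depth_from_views`, the pose convention, and the cross-product formulation are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def depth_from_views(x_ref, observations):
    """Least-squares depth of one reference pixel from several other views.

    x_ref        : (3,) homogeneous normalized coordinates (z = 1) in the
                   reference frame.
    observations : list of (R, t, x_i). R (3x3) and t (3,) map reference-camera
                   coordinates into view i (assumed convention); x_i (3,) is the
                   matched pixel in view i, homogeneous and normalized.
    """
    rows, rhs = [], []
    for R, t, x_i in observations:
        # x_i is parallel to d * R @ x_ref + t, so their cross product vanishes:
        #   d * (x_i x R x_ref) = -(x_i x t)
        rows.append(np.cross(x_i, R @ x_ref))
        rhs.append(-np.cross(x_i, t))
    a = np.concatenate(rows)  # stacked coefficients of the unknown depth d
    b = np.concatenate(rhs)   # stacked right-hand sides
    # One unknown, many equations: least squares gives d = (a . b) / (a . a).
    return float(a @ b / (a @ a))


def rot_y(angle):
    """Rotation about the y axis, used only to build synthetic poses."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


# Synthetic check: project a point of known depth into three other views.
x_ref = np.array([0.2, -0.1, 1.0])
true_depth = 5.0
point = true_depth * x_ref
poses = [
    (0.05, np.array([-0.5, 0.0, 0.0])),
    (-0.03, np.array([0.4, 0.1, 0.0])),
    (0.10, np.array([-1.0, 0.0, 0.2])),
]
observations = []
for angle, t in poses:
    R = rot_y(angle)
    p = R @ point + t
    observations.append((R, t, p / p[2]))

estimated = depth_from_views(x_ref, observations)
print(estimated)
```

With noisy correspondences the equations no longer agree exactly, and stacking more views is what makes the system over-constrained and the estimate more stable.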

In this thesis, we propose a novel system for estimating the depth of outdoor scenes from a video sequence. Exploiting the characteristics of video, our approach uses more temporal-domain information than traditional depth reconstruction methods.
We perform Structure from Motion (SfM) on images sampled from a video by extracting and matching Scale Invariant Feature Transform (SIFT) feature points. This provides camera information, including 3D translation and rotation, for all the images. We then compute the constrained optical flow between selected frames, which lets us solve an over-constrained linear system to estimate the depth map for each frame. Next, mean shift image segmentation [11] is applied to alleviate estimation errors in textureless regions and at outlier points. The initial depth maps are obtained by combining the predicted depth maps, the segmentation results, and some geometric constraints. Each initial depth map becomes the data term of our pixel-based and region-based Markov Random Field (MRF) formulations for depth map estimation. By minimizing the associated MRF energy function for each frame, we refine the depth maps to achieve visually pleasing, detail-preserving, and temporally consistent depth estimates.
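To make the final refinement step more concrete, here is a minimal sketch of a pixel-based MRF over discrete depth labels, with the initial depth map as the data term and a truncated-linear smoothness term between 4-neighbors. The exact energy, the parameters `lam` and `tau`, and the optimizer are assumptions for illustration: the abstract does not specify them, and iterated conditional modes (ICM) is used here only because it is short, not because the thesis uses it.

```python
import numpy as np


def mrf_energy(labels, init, lam=1.0, tau=2):
    """Data term |d - d0| plus truncated-linear smoothness over 4-neighbors."""
    d = labels.astype(np.int64)
    data = np.abs(d - init.astype(np.int64)).sum()
    smooth = (np.minimum(np.abs(np.diff(d, axis=0)), tau).sum()
              + np.minimum(np.abs(np.diff(d, axis=1)), tau).sum())
    return float(data + lam * smooth)


def icm_refine(init, n_labels, lam=1.0, tau=2, sweeps=5):
    """Greedy per-pixel descent (ICM) on the energy defined above."""
    init = init.astype(np.int64)
    labels = init.copy()
    h, w = labels.shape
    candidates = np.arange(n_labels)
    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                # Cost of every candidate label at this pixel: data term first.
                cost = np.abs(candidates - init[y, x]).astype(float)
                # Add the smoothness cost against each in-bounds 4-neighbor.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        diff = np.abs(candidates - labels[ny, nx])
                        cost += lam * np.minimum(diff, tau)
                best = int(np.argmin(cost))
                # Accept only strict improvements so the energy never increases.
                if cost[best] < cost[labels[y, x]]:
                    labels[y, x] = best
                    changed = True
        if not changed:
            break
    return labels


# A flat patch at depth label 3 with one outlier at label 9.
init = np.full((5, 5), 3)
init[2, 2] = 9
refined = icm_refine(init, n_labels=10)
print(mrf_energy(init, init), mrf_energy(refined, init))
```

The truncation at `tau` caps the penalty across large depth jumps, which is one common way to smooth outliers while leaving genuine depth edges in place.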

Contents
Introduction
  1  Motivation
  2  Problem Description
  3  Previous works
  4  Main Contribution
  5  Thesis Organization
Proposed Method
  1  Corresponding Points Searching
  2  Structure from Motion
  3  Constrained Optical Flow
  4  Mean Shift Image Segmentation
  5  Depth Map Computation
  6  Sky Detection
  7  Depth Map Optimization with pixel-based Markov Random Field
  8  Depth Map Optimization with region-based Markov Random Field
Experiment Results
  1  Depth Maps with/without Sky Detection
  2  Depth Maps with/without Mean Shift Segmentation
  3  Comparison of Pixel-based and Region-based MRF
  4  Depth Estimation from Real Data
Conclusion
  1  Summary
  2  Future directions
References

                                

[1] V. Hedau, D. Hoiem, and D. Forsyth. Recovering the spatial layout of cluttered rooms. In ICCV, 2009.
[2] S. Yu, H. Zhang, and J. Malik. Inferring spatial layout from a single image via depth-ordered grouping. In the 6th IEEE Computer Society Workshop on Perceptual Organization in Computer Vision, Anchorage, Alaska, 23 June 2008.
[3] A. Saxena, M. Sun, and A. Y. Ng. Make3D: Learning 3D Scene Structure from a Single Still Image. In PAMI, 2008.
[4] B. Liu, S. Gould, D. Koller. Single Image Depth Estimation From Predicted Semantic Labels. In CVPR, 2010.
[5] O. Pele, M. Werman. A Linear Time Histogram Metric for Improved SIFT Matching. In ECCV, 2008.
[6] Z. Wang, Z. Zheng. A Region Based Stereo Matching Algorithm Using Cooperative Optimization. In CVPR, 2008.
[7] L. Xu, J. Jia. Stereo Matching: An Outlier Confidence Approach. In ECCV, 2008.
[8] A. Klaus, M. Sormann, K. Karner. Segment-Based Stereo Matching Using Belief Propagation and a Self-Adapting Dissimilarity Measure. In ICPR, 2006.
[9] D. Martinec, T. Pajdla. 3D Reconstruction by Fitting Low-Rank Matrices with Missing Data. CVPR 2005, pp. 198-205, IEEE June 2005.
[10] M. Pollefeys, L. Van Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, R. Koch. Visual modeling with a hand-held camera. International Journal of Computer Vision, 59(3), 207-232, 2004.
[11] D. Comaniciu, P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Machine Intell., May 2002.
[12] D. Hoiem, A. A. Efros, and M. Hebert. Automatic photo pop-up. In SIGGRAPH, 2005.
[13] A. Saxena, S. H. Chung, A. Y. Ng. Learning Depth from Single Monocular Images. In NIPS, 2005.
[14] A. Saxena, S. H. Chung, A. Y. Ng. 3-D depth reconstruction from a single still image. In IJCV, 2007.
[15] A. Saxena, J. Schulte, A. Y. Ng. Depth estimation using monocular and stereo cues. In IJCAI, 2007.
[16] M. Brown, D. G. Lowe. Unsupervised 3D object recognition and reconstruction in unordered datasets. In Proceedings of the international conference on 3D digital imaging and modeling, 2005.
[17] Noah Snavely, Steven M. Seitz, Richard Szeliski. Modeling the World from Internet Photo Collections. International Journal of Computer Vision, 2007.
[18] M. Lourakis, A. Argyros. The design and implementation of a generic sparse bundle adjustment software package based on the Levenberg–Marquardt algorithm (Technical Report 340). Inst. of Computer Science-FORTH, Heraklion, Crete, Greece, 2004.
[19] G. Zhang, J. Jia, T. Wong, H. Bao. Recovering Consistent Video Depth Maps via Bundle Optimization. In CVPR, 2008.
[20] B. K. P. Horn, B. G. Schunck. Determining optical flow. Artificial Intelligence, vol. 17, pp. 185-203, 1981.
[21] C. H. Teng, S. H. Lai, Y. S. Chen. Accurate optical flow computation under non-uniform brightness variations. Computer Vision and Image Understanding, vol. 97, no.3, pp. 315-346, 2005.
[22] C. K. Hsieh, S. H. Lai, Y. C. Chen. Expression-Invariant Face Recognition With Constrained Optical Flow Warping. IEEE Transactions on Multimedia, vol. 11, no. 4, pp. 600-610, 2009.
[23] R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, and C. Rother. A Comparative Study of Energy Minimization Methods for Markov Random Fields. In Ninth European Conference on Computer Vision (ECCV 2006), volume 2, pages 16-29, Graz, Austria, May 2006.
[24] Y. Boykov, O. Veksler, and R. Zabih. Fast Approximate Energy Minimization via Graph Cuts. In IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 23, no. 11, pages 1222-1239, November 2001.
[25] V. Kolmogorov and R. Zabih. What Energy Functions can be Minimized via Graph Cuts? In IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 26, no. 2, pages 147-159, February 2004. An earlier version appeared in European Conference on Computer Vision (ECCV), May 2002.
[26] Y. Boykov and V. Kolmogorov. An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision. In IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 26, no. 9, pages 1124-1137, September 2004.
[27] M. T. Pourazad, P. Nasiopoulos, R. K. Ward. An H.264-based Scheme for 2D to 3D Video Conversion. In IEEE Transactions on Consumer Electronics, vol. 55, no. 2, 2009.
[28] M. Bleyer, M. Gelautz. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos. In Image and Signal Processing and Analysis, 2009.
[29] C.-C. Chang, C.-J. Lin. LIBSVM: a library for support vector machines, 2001.
[30] Y. Taguchi, B. Wilburn, C. L. Zitnick. Stereo Reconstruction with Mixed Pixels Using Adaptive Over-Segmentation. In CVPR, 2008.
[31] G. Zhang, J. Jia, T. Wong and H. Bao. Consistent Depth Maps Recovery from a Video Sequence. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 31(6):974-988, 2009.
[32] C. C. Cheng, C.-T. Li, P.-S. Huang, T.-K. Lin, Y.-M. Tsai, and L.-G. Chen. A Block-based 2D-to-3D Conversion System with Bilateral Filter. International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, Jan. 2009.

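Full-text release date: full text not authorized for public release (on-campus network)
Full-text release date: full text not authorized for public release (off-campus network)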
