Graduate student: | 陳曉薇 Chen, Hsiao-Wei |
---|---|
Thesis title: | Recovering depth map from video with moving objects (從包含移動物體的影片中重建深度圖) |
Advisor: | 賴尚宏 Lai, Shang-Hong |
Oral defense committee: | 劉庭祿 Liu, Tyng-Luh; 陳煥宗 Chen, Hwann-Tzong |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2011 |
Academic year of graduation: | 99 |
Language: | English |
Number of pages: | 46 |
Chinese keywords: | 深度估測、基於運動的三維重建、馬可夫隨機域 |
English keywords: | Depth estimation, structure from motion, Markov random field |
In this thesis, we propose a novel approach to reconstructing depth maps from a video sequence that considers both geometric coherence and temporal coherence. Most previous methods for reconstructing depth maps from video assume rigid motion, that is, a static scene, and thus cannot provide satisfactory depth estimates for regions containing moving objects. In this work, we develop a depth estimation algorithm that detects regions of moving objects and recovers the depth map in a Markov random field (MRF) framework. We first apply SIFT matching across the frames of the video sequence and compute the camera parameters for all frames and the 3D positions of the SIFT feature points via structure from motion. The depths at these SIFT points are then propagated to the whole image, based on image over-segmentation, to construct an initial depth map. Next, the depth values of segments with large re-projection errors are refined by minimizing the corresponding re-projection errors. In addition, we detect the areas of moving objects from the remaining pixels with large re-projection errors. In the final step, we optimize the depth map estimate in the MRF framework. Experimental results demonstrate the improved depth estimation of the proposed algorithm.
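The refinement step, which picks the depth that minimizes the re-projection error, can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes a simple pinhole camera (focal length `f`, principal point `cx`, `cy`), a reference camera at the origin, a discrete list of candidate depths, and known matching pixel positions in the other frames. All function names and parameters here are hypothetical.

```python
import math


def backproject(u, v, depth, f, cx, cy):
    """Lift pixel (u, v) at the given depth to a 3D point in the reference camera frame."""
    return ((u - cx) * depth / f, (v - cy) * depth / f, depth)


def project(point, R, t, f, cx, cy):
    """Project a 3D point into a camera with rotation R (3x3, row-major) and translation t."""
    x, y, z = (sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))
    return (f * x / z + cx, f * y / z + cy)


def reprojection_error(u, v, depth, observations, f, cx, cy):
    """Mean pixel distance between the projected point and its observed matches.

    observations: list of (R, t, (u_obs, v_obs)) tuples, one per neighboring frame.
    """
    point = backproject(u, v, depth, f, cx, cy)
    total = 0.0
    for R, t, (u_obs, v_obs) in observations:
        u_proj, v_proj = project(point, R, t, f, cx, cy)
        total += math.hypot(u_proj - u_obs, v_proj - v_obs)
    return total / len(observations)


def refine_depth(u, v, candidates, observations, f, cx, cy):
    """Return the candidate depth with the smallest mean re-projection error."""
    return min(
        candidates,
        key=lambda d: reprojection_error(u, v, d, observations, f, cx, cy),
    )
```

In this sketch, a pixel whose best candidate still has a large error would be a natural seed for the moving-object detection described above, since no single static depth explains its observations.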
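In this thesis, we propose a method for building depth maps from video that considers not only geometric consistency but also temporal information. Most previous methods for reconstructing depth maps from video assume that the objects in the scene are static, so they cannot estimate depth for moving objects in the video. Our method detects the regions of moving objects and uses a Markov random field framework to obtain better depth maps from video. First, we use the scale-invariant feature transform (SIFT) to find corresponding feature points in every frame of the video, and use them to perform structure from motion, which yields the camera parameters of each frame and the corresponding 3D point coordinates. Next, using these 3D points together with image segmentation information, we propagate the 3D information to the whole image; this is the initial depth map. Using re-projection, we find depth segments with large errors and correct them, taking the depth that minimizes the re-projection error as the corrected depth value. In addition, we use the parts with large re-projection errors as initial positions to help find the complete regions occupied by moving objects. Finally, we use the Markov random field framework to obtain the optimized depth map. The improved estimated depth maps produced by the proposed method are shown in the experimental results of this thesis.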
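The final optimization step can be sketched on a toy label grid. The abstract does not state which energy terms or which optimizer the thesis uses, so everything below is an assumption made for illustration: depth is quantized into integer labels, the energy is a per-pixel data cost plus a weighted absolute-difference smoothness term over 4-connected neighbors, and the minimizer is iterated conditional modes (ICM), chosen only because it is short. The names `mrf_energy` and `icm` are hypothetical.

```python
def mrf_energy(labels, data_cost, smooth_weight):
    """Total energy: data cost per pixel plus smoothness over 4-connected neighbor pairs."""
    h, w = len(labels), len(labels[0])
    energy = 0.0
    for y in range(h):
        for x in range(w):
            energy += data_cost[y][x][labels[y][x]]
            if x + 1 < w:  # right neighbor
                energy += smooth_weight * abs(labels[y][x] - labels[y][x + 1])
            if y + 1 < h:  # bottom neighbor
                energy += smooth_weight * abs(labels[y][x] - labels[y + 1][x])
    return energy


def icm(labels, data_cost, smooth_weight, num_labels, max_sweeps=10):
    """Greedy ICM: repeatedly give each pixel its locally cheapest label (assumed optimizer)."""
    h, w = len(labels), len(labels[0])
    labels = [row[:] for row in labels]  # work on a copy, leave the input untouched
    for _ in range(max_sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                neighbors = [
                    labels[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w
                ]

                def local_cost(label):
                    smooth = sum(abs(label - n) for n in neighbors)
                    return data_cost[y][x][label] + smooth_weight * smooth

                best = min(range(num_labels), key=local_cost)
                if local_cost(best) < local_cost(labels[y][x]):
                    labels[y][x] = best
                    changed = True
        if not changed:  # converged to a local minimum
            break
    return labels
```

ICM only reaches a local minimum of the energy. Stronger minimizers such as belief propagation or graph cuts are common choices for this kind of problem; which one the thesis actually uses is not stated in the abstract.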