
Graduate Student: 余若君 (Yu, Jo Chun)
Thesis Title: 使用雙攝影機系統實現之物體點擊影片去背以及深度輔助影像合成
(On-Click Video Matting and Depth-Aware Compositing using Stereo Camera)
Advisor: 黃朝宗 (Huang, Chao Tsung)
Committee Members: 賴永康 (Lai, Yeong Kang); 王家慶 (Wang, Jia Ching)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2015
Graduation Academic Year: 104 (ROC calendar)
Language: Chinese
Number of Pages: 70
Keywords (Chinese): 影片去背, 影像合成, 雙攝影機, 影像分割
Keywords (English): Video matting, Compositing, Stereo camera, Segmentation
Abstract (Chinese):

    Good algorithms for image matting and compositing are not in short supply. Video matting, however, requires a person to manually label the foreground and background in every frame, and compositing requires manually adjusting the object's size before it can be pasted, because the object's true size and the scene's depth are unknown.

    Our system lets the user select the foreground object to be matted in the first frame of a video with just a few clicks (on-click); for every subsequent frame, the system tracks the object automatically to complete the video matting. Once the foreground object has been separated and its depth computed by the stereo camera system, the object can be rescaled to the size appropriate for the depth at the paste position and composited into a new background.
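    The rescaling step follows standard stereo and pinhole-camera geometry; as a hedged sketch (the symbols below are generic pinhole/stereo quantities, not notation taken from the thesis), stereo matching yields a disparity $d$ from which depth is recovered as

        Z = \frac{f B}{d}

    with focal length $f$ and baseline $B$. An object of physical width $W$ at depth $Z$ projects to an image width $w = fW/Z$, so pasting it at a position of depth $Z'$ only requires scaling by

        s = \frac{w'}{w} = \frac{Z}{Z'}

    For example, an object captured at a depth of 2 m and pasted at a position whose depth is 4 m shrinks to half its original pixel size.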

    In the experiments, the user needs only 1-3 clicks to select the foreground object and 3-5 clicks to select the ground colors to be removed; the system then completes video matting automatically over 5-15 frames. For compositing, the user only needs to click the paste position in the new scene, and the system automatically rescales the object to the size corresponding to that depth and pastes it.


Abstract (English):

    Matting and compositing of still images are well developed. Matting a video, however, requires the user to label foreground and background in every single frame. Moreover, because the object's actual size and the depth of the scene are unknown, a proper compositing result requires manual work for each frame.

    With the aid of the proposed system, users can choose an object and matte it with only a few clicks. For the following frames, our system tracks the chosen object and generates the trimap sequence automatically for video matting. With the separated foreground object and its corresponding depth information, we can composite the object at any position in a new scene at the correct scale, as sketched below.
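    As a concrete illustration of this compositing step, the following is a minimal Python/NumPy sketch (OpenCV is used only for resizing); the function name, its arguments, and the assumption that the pasted region lies entirely inside the background are illustrative, not the thesis implementation:

        import numpy as np
        import cv2

        def depth_aware_composite(fg, alpha, z_fg, bg, click_xy, z_click):
            # fg: HxWx3 foreground colors; alpha: HxW matte in [0, 1].
            # Pinhole model: projected size scales with 1/Z, so moving the
            # object from depth z_fg to depth z_click rescales it by z_fg/z_click.
            s = z_fg / z_click
            fg_s = cv2.resize(fg.astype(np.float32), None, fx=s, fy=s,
                              interpolation=cv2.INTER_LINEAR)
            a_s = cv2.resize(alpha.astype(np.float32), None, fx=s, fy=s,
                             interpolation=cv2.INTER_LINEAR)

            h, w = a_s.shape
            x, y = click_xy                      # paste position in the new scene
            out = bg.astype(np.float32).copy()
            roi = out[y:y + h, x:x + w]          # assumes the paste fits inside bg
            # Standard compositing equation: C = alpha * F + (1 - alpha) * B.
            roi[:] = a_s[..., None] * fg_s + (1.0 - a_s[..., None]) * roi
            return out.astype(bg.dtype)

    Linear interpolation on the matte keeps the soft object boundary produced by the matting stage intact after rescaling.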

    Experimental results show that, for video matting, users need only 1-3 clicks to select the object and 3-5 clicks to remove the attached background segments. For video compositing, users simply click the desired position in a new scene, and the system automatically resizes the object to the correct scale and completes the compositing.

Contents

    Abstract (Chinese)
    Abstract (English)
    Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
        1.1  Motivation
        1.2  Related Works
            1.2.1  Segmentation
            1.2.2  Matting
            1.2.3  Compositing
        1.3  System Overview
    Chapter 2  Pre-processing System
        2.1  Semi-Global Matching for Disparity Search
        2.2  3DRS Block Matching for Motion Estimation
        2.3  Superpixels and Features
            2.3.1  Extenuate Shadow Effect
            2.3.2  Detectable Object Size
            2.3.3  Integrate Depth and Motion Features
    Chapter 3  Segmentation and Object Tracking
        3.1  Confidence-Guided Segmentation
        3.2  Confidence-Guided Object Tracking
    Chapter 4  On-Click Video Matting
        4.1  Object Selection
        4.2  Ground Removal
        4.3  Trimap Generation
        4.4  Closed-Form Matting
    Chapter 5  Depth-Aware Compositing
    Chapter 6  Experiments and Results
        6.1  Experimental Setting
        6.2  Results
            6.2.1  Trimap Generation
            6.2.2  On-Click Image Matting
            6.2.3  On-Click Video Matting
            6.2.4  Depth-Aware Compositing
    Chapter 7  Conclusion and Future Work
        7.1  Conclusion
        7.2  Future Work
    References

