
Graduate Student: 王昱智 (Wang, Yu-Chih)
Thesis Title: 應用於雙視角轉多視角系統之小波式生成引擎演算法與硬體架構
(Algorithm and VLSI Architecture of Wavelet-Based Rendering Engine for Stereo-to-Multiview Conversion System)
Advisor: 黃朝宗 (Huang, Chao-Tsung)
Committee Members: 張添烜 (Chang, Tian-Sheuan), 邱瀞德 (Chiu, Ching-Te)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2018
Graduation Academic Year: 107
Language: English
Pages: 61
Chinese Keywords: 視點合成, 超大型積體電路, 小波式生成, 相位式訊號處理, 相位式
English Keywords: View Synthesis, VLSI, Wavelet-Based Rendering, Phase-Based Signal Processing, Phase-Based
    With the rapid progress of computer vision, high-quality and stable multi-view video is becoming increasingly important. Autostereoscopic 3D televisions present image information from multiple viewpoints to give viewers a glasses-free 3D experience. What such a system really needs is a view synthesis engine that can generate images for different viewpoints, rather than storing all of the required views in massive memory. Virtual view synthesis systems can be roughly divided into two kinds: pixel-based rendering, which has been widely studied and discussed, and wavelet-based rendering, which has so far received comparatively little attention.

    To the best of our knowledge, there has been almost no further work on hardware architecture design for wavelet-based rendering. Moreover, the image quality of pixel-based rendering depends directly on the quality of the disparity map, and producing a good disparity map usually requires a complicated and highly manual process. Therefore, this thesis focuses on algorithm analysis and on hardware architecture design and implementation for wavelet-based rendering. In the algorithm, we first estimate the disparity of each wavelet, and then generate new wavelets according to each wavelet's disparity and the target viewpoint. We also propose a hardware architecture that synthesizes wavelets for multiple views at the same time, which greatly reduces on-chip memory.

    Finally, the wavelet-based rendering engine for the stereo-to-multiview conversion system is synthesized in a TSMC 40 nm process, using 6 KB of on-chip memory and 583 K logic gates. Running at 300 MHz, it delivers a throughput of 1.2 G wavelets per second to support a 45 fps phase-based view synthesis system.


    High-quality multi-view synthesis is becoming more and more important in emerging computer vision applications. A multi-view autostereoscopic 3DTV needs image information from different viewpoints to provide users with a 3D experience. Rather than storing all of the required views in massive memory, such a system needs a view synthesis engine that interpolates the different viewpoints. There are two classes of view synthesis algorithms: pixel-based rendering, a traditional method on which many studies have been conducted, and wavelet-based rendering, a novel approach to which little research has been devoted.

    To the best of the author’s knowledge, the hardware design of wavelet-based rendering has hardly been discussed. In addition, the image quality of pixel-based rendering depends directly on the quality of the disparity map, which in turn requires a complicated algorithm and a highly manual process. Therefore, this study focuses on algorithm analysis and hardware architecture design for wavelet-based rendering. First, we estimate the disparity of each wavelet from its phase information. Second, target wavelets are computed according to the viewpoints and disparities of the source wavelets. We also propose an architecture that synthesizes wavelets for multiple views simultaneously, which further reduces on-chip memory.
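The two-step algorithm above can be illustrated in software. This is a minimal sketch, not the thesis's actual implementation: the names (`wavelet_disparity`, `reproject_phase`, `omega`, `alpha`) are hypothetical, and the re-projection here shows only the core phase-remodulation idea (the thesis additionally studies re-projection filters such as linear interpolation, NUFFT, and Lanczos). Disparity is recovered from the phase difference of corresponding complex wavelet coefficients, d ≈ Δφ/ω.

```python
import numpy as np

def wavelet_disparity(c_left, c_right, omega):
    """Estimate per-wavelet disparity from the phase difference of
    corresponding complex wavelet coefficients: d ~ delta_phi / omega.

    c_left, c_right : complex subband coefficients (e.g. from a
                      complex steerable pyramid) of the stereo pair
    omega           : spatial frequency of the subband (rad/sample)
    """
    # np.angle of the product wraps the phase difference into (-pi, pi],
    # avoiding the 2*pi ambiguity of subtracting raw phases.
    delta_phi = np.angle(c_right * np.conj(c_left))
    return delta_phi / omega

def reproject_phase(c_src, disparity, omega, alpha):
    """Second step (sketch): shift each source wavelet toward a target
    viewpoint by re-modulating its phase; alpha in [0, 1] selects the
    viewpoint between the two input views."""
    return c_src * np.exp(1j * omega * alpha * disparity)
```

With `alpha = 1.0` the re-projected left coefficients land on the right view; intermediate values of `alpha` interpolate novel viewpoints.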

    We implement a VLSI circuit for the wavelet-based rendering engine in a TSMC 40 nm technology process, using 6 KB of on-chip memory and 583 K logic gates. Synthesized at 300 MHz, it delivers 1.2 G wavelets/s to support a wavelet-based view synthesis system at 45 fps.
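The reported figures imply a per-cycle rate and a per-frame wavelet budget that can be checked with simple arithmetic (derived here from the stated numbers, not quoted from the thesis):

```python
# Back-of-the-envelope check of the reported throughput figures.
CLOCK_HZ = 300e6           # operating frequency
WAVELETS_PER_SEC = 1.2e9   # reported throughput
FPS = 45                   # target frame rate

# 1.2 G wavelets/s at 300 MHz means 4 wavelets completed every cycle.
wavelets_per_cycle = WAVELETS_PER_SEC / CLOCK_HZ
# At 45 fps, each synthesized frame gets a budget of ~26.7 M wavelets.
wavelets_per_frame = WAVELETS_PER_SEC / FPS
```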

    Abstract ii
    1 Introduction 1
      1.1 Research Background 1
      1.2 Motivation 2
      1.3 Thesis Organization 3
    2 Related Work 4
      2.1 Stereo Geometry 6
      2.2 Pixel-Based Rendering 7
      2.3 Wavelet-Based Rendering 8
        2.3.1 Phase-Based Motion Magnification 9
        2.3.2 Phase-Based View Expansion 10
        2.3.3 Phase-Based Frame Interpolation 11
        2.3.4 Eulerian-Lagrangian Stereo-to-Multiview Conversion 13
    3 Algorithm Analysis of Wavelet-Based Stereo-to-Multiview Synthesis 15
      3.1 Wavelet-Based Stereo-to-Multiview Synthesis Overview 15
      3.2 Wavelet Disparity Estimation 17
        3.2.1 Analysis of Different Disparity Correction 24
      3.3 Wavelet Re-Projection Filter 26
        3.3.1 Linear Interpolation 27
        3.3.2 Non-Uniform Fast Fourier Transform 28
        3.3.3 Lanczos Filter 29
        3.3.4 Analysis of Support Window Size 29
    4 Architecture Design of Wavelet-Based Rendering Engine 32
      4.1 System Architecture of Wavelet-Based Rendering 32
      4.2 Wavelet Disparity Estimation Engine 33
        4.2.1 Disparity Update Unit 34
        4.2.2 Position Update Unit and Winner-Take-All Unit 36
        4.2.3 CORDIC Arctan Unit 36
      4.3 Wavelet Re-Projection Estimation Engine 45
        4.3.1 Position Warping Unit, Coefficient Mapping Unit, and Wavelet Factor Mapping Unit 45
        4.3.2 Wavelet Update Unit 46
    5 Implementation of Wavelet-Based Rendering 51
      5.1 Precision Issue 51
      5.2 Synthesized Result 52
    6 Conclusion and Future Work 56
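The table of contents lists a CORDIC arctan unit (Section 4.2.3), presumably used to extract wavelet phase without a hardware divider or lookup table. As a rough software model of vectoring-mode CORDIC in general (a floating-point sketch, not the thesis's fixed-point design):

```python
import math

def cordic_arctan(y, x, iters=16):
    """Vectoring-mode CORDIC: iteratively rotate the vector (x, y)
    toward the positive x-axis using shift-add micro-rotations, and
    accumulate the total rotation angle, which converges to atan2(y, x).
    Assumes x > 0; real hardware adds a quadrant pre-rotation first."""
    angle = 0.0
    for i in range(iters):
        d = 1.0 if y < 0 else -1.0          # rotate against the sign of y
        # Micro-rotation by +/- atan(2^-i); 2^-i scaling is a shift in HW.
        x, y = x - d * y * 2.0**-i, y + d * x * 2.0**-i
        angle -= d * math.atan(2.0**-i)     # atan table is precomputed in HW
    return angle
```

After `iters` iterations the angular error is bounded by roughly atan(2^-(iters-1)), so 16 iterations already give phase accuracy far below one disparity step.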

