Graduate student: | 黃世耀 Huang, Shih-Yao |
---|---|
Thesis title: | 應用於相位訊號處理之複數可控金字塔有限脈衝響應濾波器設計 FIR Filter Design of Complex Steerable Pyramid for Phase-based Processing |
Advisor: | 黃朝宗 Huang, Chao-Tsung |
Oral defense committee: | 賴永康 Lai, Yeong-Kang; 劉奕汶 Liu, Yi-Wen |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
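Year of publication: | 2020 |
Academic year of graduation: | 108 |
Language: | English |
Number of pages: | 56 |
Keywords (translated from Chinese): | complex steerable pyramid, finite impulse response filter, phase-based signal processing |
Keywords (English): | CSP, FIR-Filter, Phase-based |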
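Phase-based processing is widely used in applications that adjust the amount of motion, such as view synthesis and frame interpolation. Unlike conventional algorithms, which require complex estimation procedures, phase-based methods efficiently adjust the local phase to produce the effect of motion. To obtain the local phase, a complex steerable pyramid is needed to decompose the image into multi-scale, oriented subbands whose spatial-domain coefficients are complex; the phase can then be computed easily with an arctangent operation. The conventional implementation of the complex steerable pyramid transforms the spatial-domain signal into the frequency domain with a fast Fourier transform (FFT) and then applies bandpass filters. However, given our need for real-time computation on embedded systems, using the FFT requires more on-chip memory to store the twiddle factors and the frequency-domain filter coefficients, and in the 2-D case the 2-D separable FFT requires very high memory bandwidth. These problems make frequency-domain filtering quite expensive to implement.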
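In this thesis, our goal is to design hardware-efficient decomposition and reconstruction filters for 1-D and 2-D complex steerable pyramids. The filters comprise radial filters, which separate isotropic frequency bands, and angular filters, which further decompose them into oriented bands. To avoid the FFT, we filter in the spatial domain, which reduces on-chip memory usage and, in the 2-D case, saves memory bandwidth. In addition, we design low-tap finite impulse response (FIR) filters to further reduce computational complexity. Unlike conventional FIR filter design, filters for the complex steerable pyramid must satisfy the perfect-reconstruction property as closely as possible so that the loss in image reconstruction is minimized. We therefore propose an FIR filter design method with good reconstruction quality, and we evaluate image quality by peak signal-to-noise ratio (PSNR). We use the Kaiser-window filter design method and finally select a 9-tap radial filter and an 11-tap angular filter, achieving a PSNR of 36.37 dB in the 1-D view synthesis application and 33.46 dB in the 2-D frame interpolation application.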
Phase-based processing is widely used in motion-adjustment applications such as view synthesis and video frame interpolation. Instead of the complex estimation algorithms used by conventional methods, phase-based methods efficiently manipulate the local phase to translate motion. To obtain the local phase, a complex steerable pyramid is used to decompose images into multi-scale, oriented subbands with complex coefficients; the phase can then be computed easily by an arctangent. The conventional implementation is based on frequency-domain bandpass filtering, which relies on the fast Fourier transform (FFT). However, for real-time processing in embedded systems, this approach is costly to implement because the FFT and frequency-domain filtering require large on-chip memory for twiddle factors and filter coefficients. Moreover, in the 2-D case, the 2-D separable FFT requires high memory bandwidth.
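As a rough illustration of the local-phase idea described above, the following minimal sketch filters a 1-D signal in the spatial domain with a complex bandpass FIR filter and reads the phase with an arctangent. It is not the filter designed in the thesis: the Hann-windowed complex exponential, the centre frequency `OMEGA`, the 11-tap length, and all function names are assumptions chosen only to keep the example short.

```python
import cmath
import math

def complex_bandpass_taps(num_taps, omega0):
    """Hann-windowed complex exponential centred at omega0 (rad/sample).
    Illustrative stand-in for one complex subband filter (an assumption)."""
    half = (num_taps - 1) // 2
    taps = []
    for k in range(num_taps):
        m = k - half
        window = 0.5 + 0.5 * math.cos(2 * math.pi * m / (num_taps + 1))
        taps.append(window * cmath.exp(1j * omega0 * m))
    return taps

def subband_coefficient(signal, taps, center):
    """Spatial-domain convolution evaluated at a single sample position."""
    half = (len(taps) - 1) // 2
    return sum(h * signal[center - (k - half)] for k, h in enumerate(taps))

def local_phase(coefficient):
    """Local phase of a complex coefficient via the arctangent."""
    return math.atan2(coefficient.imag, coefficient.real)

def wrap(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

OMEGA = math.pi / 2      # assumed subband centre frequency
true_shift = 0.5         # sub-sample displacement between two views
taps = complex_bandpass_taps(11, OMEGA)

left = [math.cos(OMEGA * n) for n in range(32)]
right = [math.cos(OMEGA * (n - true_shift)) for n in range(32)]

center = 16
phase_left = local_phase(subband_coefficient(left, taps, center))
phase_right = local_phase(subband_coefficient(right, taps, center))

# A spatial shift d changes the local phase by OMEGA * d, so the
# displacement can be read back from the wrapped phase difference.
estimated_shift = wrap(phase_left - phase_right) / OMEGA
print(estimated_shift)
```

In this toy setting the signal is a single sinusoid at the filter's centre frequency, so the phase difference maps directly to the displacement; real images contain many frequencies, which is why a multi-scale, multi-orientation pyramid is used.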
In this thesis, we aim to design hardware-efficient decomposition and reconstruction filters based on the complex steerable pyramid for 1-D and 2-D real-time phase-based processing systems. Our filters comprise radial filters for isotropic frequency bands and angular filters for further separating oriented ones. To avoid using the FFT, we filter in the spatial domain, which reduces on-chip memory usage and, in the 2-D case, memory bandwidth. In addition, we design low-tap finite impulse response (FIR) filters to further reduce computational complexity. Unlike traditional FIR design methodology, filters for the complex steerable pyramid must comply with the perfect-reconstruction property as closely as possible in order to reconstruct images with minimum loss. Therefore, we propose a design methodology for FIR filters with good reconstruction quality, measured by peak signal-to-noise ratio (PSNR).
We use Kaiser windowing for our filter design and finally adopt 9-tap radial and 11-tap angular filters, which achieve a PSNR of 36.37 dB for 1-D view synthesis and 33.46 dB for 2-D video frame interpolation.
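To make the Kaiser-window design and the PSNR metric mentioned above concrete, here is a minimal sketch of a 9-tap windowed-sinc low-pass filter and a PSNR helper. The cutoff of π/2, the shape parameter `beta = 4.0`, the DC-gain normalisation, and the function names are all assumptions for illustration; the abstract does not state the parameters or the exact procedure used in the thesis.

```python
import math

def bessel_i0(x, terms=25):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2) / k
        total += term * term
    return total

def kaiser_window(num_taps, beta):
    """Kaiser window of the given length and shape parameter beta."""
    mid = (num_taps - 1) / 2
    window = []
    for n in range(num_taps):
        ratio = (n - mid) / mid
        arg = beta * math.sqrt(max(0.0, 1.0 - ratio * ratio))
        window.append(bessel_i0(arg) / bessel_i0(beta))
    return window

def kaiser_lowpass(num_taps, cutoff, beta):
    """Windowed-sinc low-pass FIR taps (cutoff in rad/sample), unit DC gain."""
    mid = (num_taps - 1) / 2
    window = kaiser_window(num_taps, beta)
    taps = []
    for n in range(num_taps):
        m = n - mid
        ideal = cutoff / math.pi if m == 0 else math.sin(cutoff * m) / (math.pi * m)
        taps.append(ideal * window[n])
    gain = sum(taps)
    return [t / gain for t in taps]

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical parameters: 9 taps as in the abstract, cutoff and beta assumed.
radial_taps = kaiser_lowpass(9, cutoff=math.pi / 2, beta=4.0)
print([round(t, 4) for t in radial_taps])
print(psnr([0, 255], [0, 254]))
```

A larger `beta` trades a wider transition band for lower sidelobes, which is one of the knobs a low-tap design has to balance against reconstruction quality.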