
Graduate Student: Hsu, Ming-Tsung (徐銘聰)
Thesis Title: Head related transfer functions filtered outputs obtained using an augmented circular microphone array and a spherical microphone array (Chinese title: 利用增強型的環形麥克風陣列和球形麥克風陣列產生頭部轉移函數的濾波輸出)
Advisor: Bai, Ming-Sian R. (白明憲)
Oral Defense Committee: Hong, Chien-Chong (洪健中); Ting, Chuan-Kang (丁川康)
Degree: Master
Department: Department of Power Mechanical Engineering, College of Engineering
Year of Publication: 2018
Graduation Academic Year: 107 (ROC calendar)
Language: English
Number of Pages: 53
Keywords (Chinese): 增強型的環形麥克風陣列、球型麥克風陣列、頭部轉移函數
Keywords (English): augmented circular microphone array, spherical microphone array, head related transfer functions
Abstract (translated from Chinese): A conventional circular microphone array cannot localize sources in elevation because it is insensitive to the zenith angle, whereas a conventional vertical linear microphone array can only localize in elevation and is insensitive to the azimuth. This thesis therefore proposes an augmented circular microphone array, in which a logarithmically spaced linear array is placed vertically at the center of a circular microphone array to localize sources in the three-dimensional sound field, and compares it with a spherical microphone array. The circular array consists of 24 equally spaced microphones distributed on a ring, the logarithmically spaced linear array consists of 8 microphones standing vertically above the center of the ring, and the spherical array consists of 32 microphones placed at prescribed positions. In the localization stage, the minimum power distortionless response (MPDR), multiple signal classification (MUSIC), and orthogonal matching pursuit (OMP) methods are applied; all three localize the sources well, with OMP requiring the longest computation time. In the separation stage, Tikhonov regularization (TIKR), convex optimization (CVX), and OMP are applied; CVX and OMP achieve better separation but introduce more distortion, whereas TIKR shows the opposite behavior. Comparing the two array models, the spherical array requires more analysis time than the augmented circular array for both localization and separation. Finally, the separated signals, together with the estimated source directions, are converted through the head-related transfer functions into left and right channels for headphone playback so that the reproduced sound carries directional cues; the listening tests confirm good perceived directionality.
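The following is a minimal sketch, not the thesis code, of how the localization stage described above could be carried out: an MPDR spectrum P(theta, phi) = 1 / (a^H R^-1 a) is scanned over a grid of zenith and azimuth angles using far-field steering vectors of a ring-plus-vertical-line geometry. The ring radius, line-array span, analysis frequency, and the simulated single source are all assumptions made for illustration.

```python
# Minimal sketch (not the thesis code): far-field MPDR spectrum scan for an
# augmented CMA made of a 24-microphone ring plus an 8-microphone
# logarithmically spaced vertical line array.  Geometry values (radius,
# line-array span) are illustrative assumptions, not the thesis hardware.
import numpy as np

c = 343.0          # speed of sound [m/s]
f = 2000.0         # analysis frequency [Hz]
k = 2 * np.pi * f / c

# Array geometry: 24 mics equally spaced on a ring of (assumed) 0.1 m radius,
# 8 mics logarithmically spaced on the vertical axis above the ring center.
radius = 0.1
phi_m = 2 * np.pi * np.arange(24) / 24
ring = np.stack([radius * np.cos(phi_m),
                 radius * np.sin(phi_m),
                 np.zeros(24)], axis=1)
z_line = np.logspace(np.log10(0.02), np.log10(0.3), 8)   # assumed 2-30 cm span
line = np.stack([np.zeros(8), np.zeros(8), z_line], axis=1)
pos = np.vstack([ring, line])                             # (32, 3) sensor positions

def steering(theta, phi):
    """Plane-wave steering vector for arrival direction (zenith theta, azimuth phi)."""
    u = np.array([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)])
    return np.exp(1j * k * pos @ u)                       # (32,)

def mpdr_spectrum(R, thetas, phis, eps=1e-6):
    """P(theta, phi) = 1 / (a^H R^-1 a); peaks indicate source directions."""
    load = eps * np.trace(R).real / R.shape[0]            # light diagonal loading
    Rinv = np.linalg.inv(R + load * np.eye(R.shape[0]))
    P = np.empty((len(thetas), len(phis)))
    for i, th in enumerate(thetas):
        for j, ph in enumerate(phis):
            a = steering(th, ph)
            P[i, j] = 1.0 / np.real(a.conj() @ Rinv @ a)
    return P

# Example with a single simulated plane wave from (60 deg zenith, 120 deg azimuth).
rng = np.random.default_rng(0)
a_true = steering(np.deg2rad(60), np.deg2rad(120))
snapshots = np.outer(a_true, rng.standard_normal(200) + 1j * rng.standard_normal(200))
snapshots += 0.05 * (rng.standard_normal(snapshots.shape) + 1j * rng.standard_normal(snapshots.shape))
R = snapshots @ snapshots.conj().T / snapshots.shape[1]   # spatial correlation matrix

thetas = np.deg2rad(np.arange(0, 181, 5))
phis = np.deg2rad(np.arange(0, 360, 5))
P = mpdr_spectrum(R, thetas, phis)
i, j = np.unravel_index(np.argmax(P), P.shape)
print("estimated direction (deg):", np.rad2deg(thetas[i]), np.rad2deg(phis[j]))
```

MUSIC would replace the MPDR spectrum with a projection onto the noise subspace of R, and OMP would instead greedily pick the steering vectors that best match the snapshots over the same grid, which is consistent with the abstract's remark that OMP requires the longest localization time on a dense direction grid.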


Abstract (English): A conventional uniform circular microphone array (UCA) cannot resolve signals in the zenith angle, whereas a vertical uniform linear array (ULA) cannot resolve signals in the azimuth. This thesis presents an augmented circular microphone array (CMA) consisting of an unbaffled CMA with a logarithmic-spacing linear array (LLA) as a vertical extension to analyze the 3-D sound field. A spherical microphone array (SMA) consisting of 32 microphones is used as a benchmark for the augmented CMA. The CMA has 24 equally spaced microphones mounted on the ring, and the LLA has 8 logarithmically spaced microphones standing vertically above the center of the CMA. Localization is performed with minimum power distortionless response (MPDR), multiple signal classification (MUSIC), and orthogonal matching pursuit (OMP); all three methods localize the sources well, with OMP requiring the longest computation time. Separation is performed with Tikhonov regularization (TIKR), convex optimization (CVX), and OMP; CVX and OMP achieve better separation but introduce more distortion, whereas TIKR exhibits the opposite trade-off. Finally, using the head-related transfer functions (HRTFs), the separated source signals are converted into left and right channels for headphone playback. The results of the listening tests demonstrate the efficacy of binaural rendering with the proposed system.
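As a complementary sketch of the separation and rendering stages (again an illustration under assumptions, not the thesis implementation), the snippet below extracts source amplitudes with a per-bin Tikhonov-regularized solve at the estimated directions and then convolves one separated source with a left/right head-related impulse response (HRIR) pair, such as the MIT KEMAR measurements cited in [32], to form the two headphone channels. The regularization weight, array dimensions, and HRIR data are placeholders.

```python
# Minimal sketch (assumptions, not the thesis implementation): Tikhonov-
# regularized (TIKR) source extraction at the localized directions, followed
# by HRTF-based binaural rendering of one separated source.  The HRIR data
# below are random placeholders standing in for a measured set such as the
# MIT KEMAR database cited in [32].
import numpy as np
from scipy.signal import fftconvolve

def tikr_separate(A, X, beta=1e-2):
    """Per-frequency-bin TIKR solution  q = (A^H A + beta I)^-1 A^H x.
    A: (mics, sources) steering matrix at the estimated directions,
    X: (mics, frames) STFT snapshots of one bin -> returns (sources, frames)."""
    AhA = A.conj().T @ A
    return np.linalg.solve(AhA + beta * np.eye(A.shape[1]), A.conj().T @ X)

def render_binaural(source, hrir_left, hrir_right):
    """Convolve a separated mono signal with the HRIR pair of its estimated
    direction to obtain left/right headphone channels."""
    left = fftconvolve(source, hrir_left)
    right = fftconvolve(source, hrir_right)
    out = np.stack([left, right], axis=1)
    return out / np.max(np.abs(out))          # normalize to avoid clipping

# Illustrative usage with random placeholders standing in for the measured
# steering matrix, array STFT data, and KEMAR HRIRs.
rng = np.random.default_rng(1)
A = rng.standard_normal((32, 2)) + 1j * rng.standard_normal((32, 2))
X = rng.standard_normal((32, 100)) + 1j * rng.standard_normal((32, 100))
Q = tikr_separate(A, X)                       # (2 sources, 100 frames)

mono = rng.standard_normal(48000)             # stand-in for a resynthesized source
hrir_l = rng.standard_normal(512) * 0.01      # stand-in for measured HRIRs
hrir_r = rng.standard_normal(512) * 0.01
stereo = render_binaural(mono, hrir_l, hrir_r)
print(stereo.shape)                           # (48511, 2) left/right channels
```

CVX and OMP would replace the closed-form TIKR solve with a constrained convex program or a greedy sparse fit over the steering dictionary, which matches the separation/distortion trade-off reported in the abstracts.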

Table of Contents:
Abstract I
Acknowledgements III
TABLE OF CONTENTS IV
LIST OF FIGURES VI
LIST OF TABLES IX
Chapter 1 INTRODUCTION 1
Chapter 2 ARRAY SIGNAL PROCESSING 4
  2.1 An augmented CMA 4
    2.1.1 Farfield array model 4
    2.1.2 Spatial correlation matrix 5
    2.1.3 The steering matrix of an augmented CMA 6
  2.2 SMA 7
    2.2.1 Spherical harmonics used to expand plane wave 7
    2.2.2 Arrays formulated in the space domain 8
Chapter 3 ARRAY FORMULATIONS 13
  3.1 Two stage methods 13
    3.1.1 MPDR algorithm 13
    3.1.2 MUSIC algorithm 15
    3.1.3 TIKR method 16
  3.2 One stage methods 17
    3.2.1 Convex optimization 17
    3.2.2 Orthogonal matching pursuit 18
  3.3 Approach of array-HRTF for headphone rendering 18
Chapter 4 SIMULATION 24
Chapter 5 EXPERIMENT 35
Chapter 6 CONCLUSIONS 49
REFERENCES 50

[1] M. R. Bai, Y. H. Yao, C. S. Lai, and Y. Y. Lo, “Design and implementation of a space domain spherical microphone array with application to source localization and separation,” J. Acoust. Soc. Am., vol. 139, 2016, pp. 1058-1070.
[2] T. D. Abhayapala and D. B. Ward, “Theory and design of high order sound field microphones using spherical microphone array,” 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, 2002, pp. II-1949-II-1952.
[3] J. Meyer and G. Elko, “A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield,” 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, 2002, pp. II-1781-II-1784.
[4] M. Park and B. Rafaely, “Sound-field analysis by plane-wave decomposition using spherical microphone array,” J. Acoust. Soc. Am., vol. 118, 2005, pp. 3094-3103.
[5] G. W. Elko, R. A. Kubli, and J. M. Meyer, “Audio system based on at least second-order eigenbeams,” U.S. Patent No. 7587054, 2009.
[6] G. W. Elko, R. A. Kubli, and J. M. Meyer, “Audio system based on at least second-order eigenbeams,” U.S. Patent No. 8433075, 2013.
[7] M. R. Bai, C. S. Lai, and P. C. Wu, “Localization and separation of acoustic sources by using a 2.5-dimensional circular microphone array,” J. Acoust. Soc. Am., vol. 139, 2016, pp. 286-297.
[8] Z. Wang, H. Zhang, and G. Bi, “Speech Signal Recovery Based on Source Separation and Noise Suppression,” Journal of Computer and Communications, vol. 2, 2014, pp. 112-120.
[9] Y. H. Kim and J. W. Choi, “Sound Visualization and Manipulation,” Wiley, Singapore, Chap. 4, 2013.
[10] M. R. Bai, C. W. Tung, and C. C. Lee, “Optimal design of loudspeaker arrays for robust cross-talk cancellation using the Taguchi method and the genetic algorithm,” J. Acoust. Soc. Am., vol. 117, 2005, pp. 2802-2813.
[11] P. C. Loizou, “Speech Enhancement: Theory and Practice,” Taylor & Francis, London, 2007.
[12] T. E. Tuncer and B. Friedlander, “Classical and Modern Direction-of-Arrival Estimation,” Academic Press, United States, Chap. 4-6, 2009.
[13] M. Bertero, T. Poggio, and V. Torre, “Ill-Posed Problems in Early Vision,” Proceedings of the IEEE, vol. 76, 1988, pp. 869-889.
[14] J. Capon, “High-resolution frequency-wavenumber spectrum analysis,” Proceedings of the IEEE, vol. 57, 1969, pp. 1408-1418.
[15] R. O. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Transactions on Antennas and Propagation, vol. 34, 1986, pp. 276-280.
[16] A. N. Tikhonov, “Solution of nonlinear integral equations of the first kind,” Soviet Math. Dokl., vol. 5, 1964, pp. 835-838.
[17] M. Bertero, T. Poggio, and V. Torre, “Ill-Posed Problems in Early Vision,” Proceedings of the IEEE, vol. 76, 1988, pp. 869-889.
[18] P. R. Johnston and R. M. Gulrajani, “Selecting the Corner in the L-Curve Approach to Tikhonov Regularization,” IEEE Transactions on Biomedical Engineering, vol. 47, 2000, pp. 1293-1296.
[19] E. J. Candès and M. B. Wakin, “An introduction to compressive sampling,” IEEE Signal Processing Magazine, vol. 25, 2008, pp. 21-30.
[20] G. F. Edelmann and C. F. Gaumond, “Beamforming using compressive sensing,” J. Acoust. Soc. Am., vol. 130, 2011.
[21] S. Boyd and L. Vandenberghe, “Convex Optimization,” Cambridge University Press, New York, Chap. 1-7, 2004.
[22] M. R. Bai and C. C. Chen, “Application of Convex Optimization to Acoustical Array Signal Processing,” Journal of Sound and Vibration, vol. 332, 2013, pp. 6596-6616.
[23] M. Grant and S. Boyd, “CVX: Matlab Software for Disciplined Convex Programming,” version 1.21, MATLAB software, http://cvxr.com/cvx, 2013.
[24] T. T. Cai and L. Wang, “Orthogonal Matching Pursuit for Sparse Signal Recovery With Noise,” IEEE Transactions on Information Theory, vol. 57, 2011, pp. 4680-4688.
[25] ITU-T Recommendation P.862, “Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs,” International Telecommunication Union, Geneva, Switzerland, 2001.
[26] ITU-T Recommendation P.862.2, “Wideband extension to Recommendation P.862 for the assessment of wideband telephone networks and speech codecs,” International Telecommunication Union, Geneva, Switzerland, 2007.
[27] E. Vincent, R. Gribonval, and C. Févotte, “Performance measurement in blind audio source separation,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, 2006, pp. 1462-1469.
[28] M. Rothbucher, M. Durkovic, T. Habigt, H. Shen, and K. Diepold, “HRTF-Based Localization and Separation of Multiple Sound Sources,” 2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication, Paris, 2012, pp. 1092-1096.
[29] M. Farmani, M. S. Pedersen, Z. H. Tan, and J. Jensen, “On the influence of microphone array geometry on HRTF-based sound localization,” 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, 2015, pp. 439-443.
[30] J. A. MacDonald, “A localization algorithm based on head-related transfer functions,” J. Acoust. Soc. Am., vol. 123, 2008, pp. 4290-4296.
[31] N. R. Shabtai, “Optimization of the directivity in binaural sound reproduction beamforming,” J. Acoust. Soc. Am., vol. 138, 2015, pp. 3118-3128.
[32] B. Gardner and K. Martin, “HRTF Measurements of a KEMAR Dummy-Head Microphone,” MIT Media Laboratory, http://sound.media.mit.edu/resources/KEMAR.html, 2000.
