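Graduate student: | 楊吉文 Ji-Wen Yang |
---|---|
Thesis title: | 以麥克風陣列與語音預估作語音增強之研究 Speech Enhancement Based on Microphone Array and Speech Estimation |
Advisor: | 王小川 Hsiao-Chuan Wang |
Oral examination committee: | |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2006 |
Academic year of graduation: | 95 |
Language: | Chinese |
Number of pages: | 55 |
Keywords (Chinese): | 麥克風陣列 (microphone array), 語音預估 (speech estimation), 噪音消除 (noise cancellation) |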
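This thesis takes the microphone array as its starting point for noise cancellation. First, three methods are used to estimate the time delay between microphones: the cross-power spectrum phase, the spatial power spectrum, and the cross-correlation of the signals in the time domain. Delay-and-sum beamforming is then applied with the aim of removing zero-mean background noise. Next, within the microphone-array system, an adaptive filter performs noise cancellation. The filter is derived from the least square error (LSE) criterion, and the LSE concept is applied a second time to improve how the reference noise for the adaptive filter is generated.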
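The time-domain cross-correlation method and the delay-and-sum step described above can be sketched as follows. This is a minimal illustration only, not the thesis's implementation: it assumes integer sample delays, equal-length channels, and two hypothetical helper names, `estimate_delay` and `delay_and_sum`, introduced here for the example.

```python
def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means `sig` is a delayed copy of `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(sig):
                score += r * sig[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_and_sum(channels, delays):
    """Advance each channel by its delay, then average the aligned channels."""
    length = len(channels[0])
    out = [0.0] * length
    for channel, delay in zip(channels, delays):
        for n in range(length):
            m = n + delay
            if 0 <= m < length:
                out[n] += channel[m]
    return [value / len(channels) for value in out]


# Toy two-microphone example: the second channel lags the first by 3 samples.
source = [0.0] * 40
source[10], source[20], source[25] = 1.0, -2.0, 0.5
mic0 = source
mic1 = [0.0] * 3 + source[:-3]

delay = estimate_delay(mic0, mic1, max_lag=5)   # 3 for this toy signal
aligned = delay_and_sum([mic0, mic1], [0, delay])
```

Averaging the aligned channels leaves the coherent source unchanged, while noise that is uncorrelated between microphones is attenuated. This is why the method targets zero-mean background noise.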
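The operations above reduce the array output to a single-channel signal, which is then transformed to the frequency domain to estimate the spectrum of the residual noise. The estimator used is minima controlled recursive averaging (MCRA). It estimates the noise spectrum recursively, frame by frame: it tracks local minima and combines them with a parameter resembling the signal-to-noise ratio and several thresholds to decide whether each frequency bin contains speech, and it updates the noise estimate according to that decision. The estimated noise spectrum is then used for speech enhancement. The most basic method is spectral subtraction, but it produces musical noise when the signal is transformed back to the time domain. We therefore use another method, log-spectral amplitude estimation. It takes the noisy speech signal as input and estimates a gain function such that multiplying the input by this gain yields an estimate of the clean speech. The principle is similar to that of the noise power estimation: under the two hypotheses of speech presence and speech absence, the gain function is adapted dynamically in each frequency band. When a band is detected as close to speech, a larger gain is applied; otherwise, a smaller gain is applied.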
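The recursive noise estimate described above can be illustrated with a simplified per-bin sketch in the spirit of MCRA. It is not the thesis's code and omits the frequency smoothing of the published algorithm. The function name `mcra_noise_estimate`, the parameter names, and the default values are assumptions chosen for illustration, and the first frame is assumed to contain noise only.

```python
def mcra_noise_estimate(power_frames, alpha_s=0.8, alpha_p=0.2, alpha_d=0.95,
                        delta=5.0, window=100):
    """Track the noise power in each frequency bin across frames.

    power_frames: list of frames, each a list of noisy power values |Y|^2.
    Returns one list of noise power estimates per input frame.
    """
    num_bins = len(power_frames[0])
    smoothed = list(power_frames[0])  # time-smoothed noisy power
    s_min = list(power_frames[0])     # tracked local minimum of `smoothed`
    s_tmp = list(power_frames[0])     # minimum inside the current window
    presence = [0.0] * num_bins       # smoothed speech-presence probability
    noise = list(power_frames[0])     # assumption: frame 0 is noise only
    history = [list(noise)]

    for index, frame in enumerate(power_frames[1:], start=1):
        for k in range(num_bins):
            # 1. Recursive averaging of the noisy power over time.
            smoothed[k] = alpha_s * smoothed[k] + (1 - alpha_s) * frame[k]
            # 2. Local-minimum tracking, restarted every `window` frames.
            if index % window == 0:
                s_min[k] = min(s_tmp[k], smoothed[k])
                s_tmp[k] = smoothed[k]
            else:
                s_min[k] = min(s_min[k], smoothed[k])
                s_tmp[k] = min(s_tmp[k], smoothed[k])
            # 3. SNR-like ratio against a threshold gives a speech indicator.
            indicator = 1.0 if smoothed[k] > delta * s_min[k] else 0.0
            presence[k] = alpha_p * presence[k] + (1 - alpha_p) * indicator
            # 4. The noise update slows down as speech presence approaches 1.
            alpha = alpha_d + (1 - alpha_d) * presence[k]
            noise[k] = alpha * noise[k] + (1 - alpha) * frame[k]
        history.append(list(noise))
    return history
```

When a bin is judged to contain speech, the effective smoothing factor approaches 1 and the noise estimate is nearly frozen, so a short burst of speech energy barely leaks into the noise spectrum.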
Finally, the inverse Fourier transform converts the signal back to the time domain, and the overlap-add algorithm reconstructs the signal waveform.
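The overlap-add reconstruction can be sketched as below, assuming a periodic Hann analysis window, an even frame length, and 50% overlap. None of these choices are stated in the abstract. The helper names `hann`, `split_frames`, and `overlap_add` are hypothetical. In a full pipeline each frame would be transformed, scaled by the gain function, and inverse-transformed before the final step; here the frames pass through unchanged to show that the framing itself is lossless.

```python
import math


def hann(frame_len):
    """Periodic Hann window; copies shifted by frame_len // 2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
            for i in range(frame_len)]


def split_frames(signal, frame_len):
    """Cut the signal into windowed frames with 50% overlap."""
    hop = frame_len // 2
    window = hann(frame_len)
    return [[signal[start + i] * window[i] for i in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def overlap_add(frames, frame_len):
    """Sum the frames back at their original positions."""
    hop = frame_len // 2
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for j, frame in enumerate(frames):
        for i, value in enumerate(frame):
            out[j * hop + i] += value
    return out


signal = [math.sin(0.1 * n) for n in range(64)]
frames = split_frames(signal, frame_len=16)
rebuilt = overlap_add(frames, frame_len=16)
```

Away from the first and last half frame, which are covered by a single window, `rebuilt` matches `signal` up to floating-point rounding because the overlapping window halves sum to one.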