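Graduate student: | 廖育志 Liao, Yu-Zhi |
---|---|
Thesis title: | 結合雜訊抑制與帶聲語音重建之語音增強系統 A Speech Enhancement System based on Noise Suppression and Voiced Speech Reconstruction |
Advisor: | 王小川 Wang, Hsiao-Chuan |
Committee members: | 李琳山, 陳信宏, 王逸如, 王小川 |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2011 |
Academic year of graduation: | 99 |
Language: | Chinese |
Pages: | 56 |
Keywords: | speech enhancement, noise estimation, spectral gain function estimation, voiced speech detection, sinusoidal model reconstruction, linear prediction filtering |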
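Single-channel speech enhancement has long been an active area of speech research. When a speech signal is contaminated by background noise, speech enhancement techniques aim to remove that noise and recover the original speech. Many everyday speech-technology products are easily disturbed by ambient noise during use, so the quality of the speech enhancement can decide whether such products are accepted by the general public.
To address this problem, this thesis proposes a two-stage speech enhancement method.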
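Stage 1:
This stage does two things. First, the noise is estimated with the MCRA (minima controlled recursive averaging) method. Second, the gain function is estimated with a modified OMLSA (optimally modified log-spectral amplitude) estimator: the results of voiced-frame detection are used to impose additional bound constraints, so that the estimated gain function follows the true state more closely. Voiced frames are detected using a normalized linear residual signal ratio.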
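The first stage can be illustrated with a minimal sketch of MCRA-style noise tracking in a single frequency bin, together with the standard OMLSA gain combination. This is a simplified illustration, not the thesis implementation: the function names, the parameter values, and the sliding-window minimum are assumptions chosen for readability, frequency smoothing is omitted, and the thesis's additional bound constraints from voiced-frame detection are not reproduced because the abstract does not specify them.

```python
def mcra_noise_track(power_frames, alpha_s=0.8, alpha_d=0.95,
                     alpha_p=0.2, delta=5.0, win=8):
    """Track the noise power of one frequency bin, frame by frame.

    power_frames: noisy power |Y|^2 of this bin, one value per frame.
    All parameter values are illustrative assumptions.
    """
    smoothed = power_frames[0]   # recursively smoothed noisy power
    noise = power_frames[0]      # running noise power estimate
    p_speech = 0.0               # smoothed speech presence probability
    history = [smoothed]         # recent smoothed values for the minimum
    estimates = []
    for y2 in power_frames:
        smoothed = alpha_s * smoothed + (1.0 - alpha_s) * y2
        history.append(smoothed)
        history = history[-win:]          # simplified sliding minimum window
        s_min = min(history)
        # Speech is assumed present when the power rises well above the minimum.
        indicator = 1.0 if smoothed > delta * s_min else 0.0
        p_speech = alpha_p * p_speech + (1.0 - alpha_p) * indicator
        # The noise update slows down as speech presence becomes likely.
        alpha_tilde = alpha_d + (1.0 - alpha_d) * p_speech
        noise = alpha_tilde * noise + (1.0 - alpha_tilde) * y2
        estimates.append(noise)
    return estimates


def omlsa_gain(gain_h1, p_speech, gain_min=0.1):
    """Combine the speech-present gain with a gain floor, OMLSA style.

    G = G_H1 ** p * G_min ** (1 - p), where p is the speech presence
    probability. gain_min is an illustrative value.
    """
    return (gain_h1 ** p_speech) * (gain_min ** (1.0 - p_speech))
```

With a stationary input the noise estimate stays at the input level, while a short high-energy burst barely moves it, because the speech presence probability pushes the update coefficient towards 1 during the burst.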
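Stage 2:
This stage reconstructs the speech signal with a sinusoidal model in order to strengthen the harmonic components of the speech. First, the pitch (fundamental frequency) estimate of the previous frame is used to determine the pitch reference value for the current frame. Second, linear prediction filtering is used to remove the aperiodic components of non-voiced frames from the signal, which further reinforces the harmonic components and improves speech quality.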
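As a rough illustration of sinusoidal-model reconstruction, the sketch below synthesizes one voiced frame as a sum of harmonics of the fundamental frequency. The function name, the 8 kHz sampling rate, and the way amplitudes and phases are passed in are assumptions for the example; how the thesis estimates these parameters is not described in the abstract.

```python
import math


def synthesize_harmonics(f0, amplitudes, phases, n_samples, fs=8000):
    """Synthesize a frame as a sum of harmonics of f0 (in Hz).

    amplitudes[h-1] and phases[h-1] describe the h-th harmonic.
    Harmonics at or above the Nyquist frequency are skipped.
    """
    if len(amplitudes) != len(phases):
        raise ValueError("amplitudes and phases must have the same length")
    frame = []
    for n in range(n_samples):
        sample = 0.0
        for h, (amp, phi) in enumerate(zip(amplitudes, phases), start=1):
            if h * f0 >= fs / 2:
                break  # higher harmonics would alias
            sample += amp * math.cos(2.0 * math.pi * h * f0 * n / fs + phi)
        frame.append(sample)
    return frame
```

For example, `synthesize_harmonics(100.0, [1.0, 0.5], [0.0, 0.0], 160)` produces a 20 ms frame at 8 kHz whose waveform repeats every 80 samples, the pitch period for a 100 Hz fundamental.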
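The results are evaluated with segmental SNR improvement and the Perceptual Evaluation of Speech Quality (PESQ), and the individual stages of the proposed method are compared. In terms of segmental SNR improvement, the proposed method clearly outperforms the conventional MCRA-OMLSA speech enhancement approach; in terms of PESQ, however, there is no significant difference between the two.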
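The segmental SNR improvement used above can be sketched as follows. This is a common textbook formulation and not necessarily the exact one used in the thesis: the frame length and the clamping of each frame's SNR to the range -10 dB to 35 dB are assumptions, as are the function names.

```python
import math


def segmental_snr(clean, processed, frame_len=80, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB of `processed` against `clean`.

    Each frame's SNR is clamped to [lo, hi] dB before averaging.
    """
    n_frames = min(len(clean), len(processed)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    total = 0.0
    for i in range(n_frames):
        c = clean[i * frame_len:(i + 1) * frame_len]
        p = processed[i * frame_len:(i + 1) * frame_len]
        sig = sum(x * x for x in c)
        err = sum((x - y) ** 2 for x, y in zip(c, p))
        if err == 0.0:
            snr = hi      # no residual error: use the upper clamp
        elif sig == 0.0:
            snr = lo      # silent clean frame: use the lower clamp
        else:
            snr = 10.0 * math.log10(sig / err)
        total += min(max(snr, lo), hi)
    return total / n_frames


def segmental_snr_improvement(clean, noisy, enhanced, frame_len=80):
    """Segmental SNR of the enhanced signal minus that of the noisy input."""
    return (segmental_snr(clean, enhanced, frame_len)
            - segmental_snr(clean, noisy, frame_len))
```

A positive improvement means the enhanced signal is closer to the clean reference than the noisy input was, frame by frame.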