Author: | 杜宗憲 Du, Zong-Sian |
---|---|
Thesis title: | 雜訊刪減與有聲語音訊號重建之研究 A Study on Noise Reduction and Voiced Speech Signal Reconstruction |
Advisor: | 王小川 Wang, Hsiao-Chuan |
Oral defense committee: | |
Degree: | 碩士 Master |
Department: | 電機資訊學院 - 電機工程學系 Department of Electrical Engineering |
Year of publication: | 2009 |
Academic year of graduation: | 98 |
Language: | Chinese |
Number of pages: | 74 |
Keywords (Chinese): | 語音增強、弦波模型 |
Keywords (English): | speech enhancement, harmonic model |
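Many everyday applications, such as mobile phones, hearing-aid systems for the hearing impaired, and automatic speech recognition (ASR) systems, suffer from interference by environmental noise. This interference must be overcome so that listeners hear speech of better quality or so that the system's recognition rate improves. Typical single-channel speech enhancement techniques first estimate the noise and then compute a spectral gain function that suppresses the noise contained in the speech.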
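This thesis proposes a two-stage single-channel speech enhancement method. The first stage is based on short-time spectral amplitude (STSA) estimation. The second stage re-estimates the voiced speech signal with a sinusoidal model, enhancing its harmonic components and suppressing the noise between harmonics. For minima controlled recursive averaging (MCRA) noise estimation, an improvement to the speech presence detection is proposed that makes the detection result more clear-cut and in turn improves the noise estimate. When the decision-directed approach is used to estimate the a priori SNR, the smoothing constant is made to vary from frame to frame so that it tracks changes in the spectrum. The sinusoidal model method then takes the voiced speech signal enhanced in the first stage and further enhances its harmonic components to improve speech quality.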
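As a rough illustration of the decision-directed step, the sketch below estimates the a priori SNR frame by frame and allows a different smoothing constant for each frame. It is not the thesis's method: the abstract does not state the rule used to adapt the smoothing constant, so `alpha` is simply supplied by the caller, and a Wiener gain stands in for the STSA estimator to keep the example short. The function name, the `xi_min` floor, and the array layout are all assumptions made for illustration.

```python
import numpy as np

def decision_directed_gain(noisy_power, noise_power, alpha, xi_min=10 ** (-25 / 10)):
    """Decision-directed a priori SNR estimate with a per-frame smoothing constant.

    noisy_power: array (frames, bins) of noisy-speech power spectra |Y|^2.
    noise_power: noise power estimate, shape (bins,) or (frames, bins).
    alpha:       scalar or array (frames,) of smoothing constants in [0, 1]
                 (hypothetical input; the adaptation rule is not given here).
    Returns an array (frames, bins) of spectral gains.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    n_frames, n_bins = noisy_power.shape
    noise_power = np.broadcast_to(np.asarray(noise_power, dtype=float), noisy_power.shape)
    noise_power = np.maximum(noise_power, 1e-12)  # guard against division by zero
    alpha = np.broadcast_to(np.asarray(alpha, dtype=float), (n_frames,))

    gains = np.empty_like(noisy_power)
    prev_clean_power = np.zeros(n_bins)  # no clean-speech estimate before frame 0
    for t in range(n_frames):
        gamma = noisy_power[t] / noise_power[t]  # a posteriori SNR
        # Weighted mix of the previous frame's estimate and the current ML estimate.
        xi = (alpha[t] * prev_clean_power / noise_power[t]
              + (1.0 - alpha[t]) * np.maximum(gamma - 1.0, 0.0))
        xi = np.maximum(xi, xi_min)  # floor the a priori SNR
        gains[t] = xi / (1.0 + xi)  # Wiener gain (stand-in for an STSA estimator)
        prev_clean_power = gains[t] ** 2 * noisy_power[t]
    return gains
```

Passing a scalar `alpha` (for example 0.98) gives the fixed-constant form of the estimator, while passing an array with one value per frame lets the estimate follow spectral changes more quickly in frames where the constant is smaller.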
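The experimental results are evaluated with segmental SNR improvement and the perceptual evaluation of speech quality (PESQ), comparing the stages of the proposed method. The results show that the second stage yields a further improvement in PESQ, although the audible improvement is less noticeable.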
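For readers unfamiliar with the first metric, the following is a minimal sketch of segmental SNR improvement, assuming time-aligned clean, noisy, and enhanced signals. The frame length and the clamping of each frame's SNR to the range -10 dB to 35 dB are common conventions chosen here as assumptions; the abstract does not say which settings the thesis used, and the function names are illustrative.

```python
import numpy as np

def segmental_snr(clean, processed, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo_db, hi_db]."""
    clean = np.asarray(clean, dtype=float)
    processed = np.asarray(processed, dtype=float)
    n_frames = min(len(clean), len(processed)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    eps = 1e-12  # avoids log of zero and division by zero
    frame_snrs = []
    for i in range(n_frames):
        start, stop = i * frame_len, (i + 1) * frame_len
        s = clean[start:stop]
        err = s - processed[start:stop]  # residual noise and distortion
        snr_db = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(err ** 2) + eps))
        frame_snrs.append(np.clip(snr_db, lo_db, hi_db))
    return float(np.mean(frame_snrs))

def segmental_snr_improvement(clean, noisy, enhanced, **kwargs):
    """Segmental SNR of the enhanced signal minus that of the noisy input."""
    return segmental_snr(clean, enhanced, **kwargs) - segmental_snr(clean, noisy, **kwargs)
```

A positive return value from `segmental_snr_improvement` means the enhanced signal is closer to the clean reference, frame by frame, than the noisy input was.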