
Author: 林典蔚 (Dain-Wei Lin)
Thesis title: A Study on the Noise Estimation and Reduction of Speech Signals (語音訊號中的雜訊預估與刪減方法研究)
Advisor: 王小川 (Hsiao-Chuan Wang)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: English
Number of pages: 66
Keywords (Chinese): 雜訊預估, 語音增強, 事前訊噪比預估器, 事前非語音存在機率預估, 頻譜增益, 語音
Keywords (English): noise estimation, speech enhancement, a priori SNR estimator, a priori SAP estimator, spectral gain, speech
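  • Speech enhancement has a wide range of important applications. For example, many speech recognition systems and the most common voice communication systems can improve their performance by incorporating it. A typical single-channel speech enhancement technique first estimates the noise at the front end and then computes a spectral gain at the back end to enhance the speech. In real, non-stationary environments, accurately estimating the noise from noisy speech alone is very difficult.
    In this thesis, therefore, the front end adds a causal a priori SNR estimator and an a priori speech absence probability (SAP) estimator to further improve the noise estimation algorithm proposed in the paper on noise estimation with rapid adaptation (NERA). The back-end spectral gain uses the two-step noise reduction (TSNR) algorithm so that the enhanced speech attains better quality.
    The experiments are divided into two parts, applying speech enhancement to recorded noisy speech and to artificial noisy speech. The latter additionally varies the noise level and noise type to simulate non-stationary environments. The results show that the algorithm estimates the noise more accurately and, combined with the back-end spectral gain, yields the best results.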


    Speech enhancement is important in many applications. For example, it can serve as the front end of a speech recognition system or a digital voice communication system. In the single-channel case, where only the noisy speech is available, estimating the noise in non-stationary environments is difficult. A common approach is to estimate the noise and then calculate a spectral gain to enhance the speech. This thesis proposes a noise estimation algorithm that modifies the noise estimation with rapid adaptation (NERA) algorithm. It combines a causal a priori SNR estimator with a speech absence probability (SAP) estimator and applies the two-step noise reduction (TSNR) technique to enhance the speech signal. The experimental results show that the proposed speech enhancement algorithm tracks the noise efficiently across various noise types and levels. It estimates the noise more accurately than the NERA algorithm and improves the quality of the enhanced speech.
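As a rough illustration of the back-end spectral gain mentioned in the abstract, the following Python sketch computes a two-step gain per frequency bin from a given noise power estimate. It is not the thesis's implementation: the function names `tsnr_gains` and `enhance_frames`, the smoothing factor `beta = 0.98`, the Wiener-type gain rule, and the use of the previous frame's final estimate in the first step are all assumptions made for this example, and the noise estimate is taken as an input from whatever front-end estimator is used.

```python
def tsnr_gains(noisy_power, noise_power, prev_clean_power, beta=0.98):
    """Return (gain_dd, gain_tsnr) per frequency bin for one frame.

    noisy_power      : |X(k)|^2 of the current noisy frame
    noise_power      : estimated noise power per bin (from any front-end estimator)
    prev_clean_power : |S_hat(k)|^2 of the previous enhanced frame
    beta             : decision-directed smoothing factor (assumed value)
    """
    gain_dd, gain_tsnr = [], []
    for x2, n2, s2_prev in zip(noisy_power, noise_power, prev_clean_power):
        n2 = max(n2, 1e-12)  # guard against division by zero
        post_snr = x2 / n2   # a posteriori SNR
        # Step 1: decision-directed a priori SNR and Wiener-type gain.
        xi_dd = beta * s2_prev / n2 + (1.0 - beta) * max(post_snr - 1.0, 0.0)
        g_dd = xi_dd / (1.0 + xi_dd)
        # Step 2: re-estimate the a priori SNR from the step-1 output
        # of the current frame, then recompute the gain.
        xi_tsnr = (g_dd ** 2) * x2 / n2
        g_tsnr = xi_tsnr / (1.0 + xi_tsnr)
        gain_dd.append(g_dd)
        gain_tsnr.append(g_tsnr)
    return gain_dd, gain_tsnr


def enhance_frames(noisy_frames, noise_frames, beta=0.98):
    """Apply tsnr_gains frame by frame; return the second-step gains."""
    if not noisy_frames:
        return []
    prev_clean = [0.0] * len(noisy_frames[0])  # no speech estimate before frame 0
    all_gains = []
    for noisy, noise in zip(noisy_frames, noise_frames):
        _, g = tsnr_gains(noisy, noise, prev_clean, beta)
        # Power of the enhanced frame, fed back into the next frame's step 1.
        prev_clean = [gk * gk * x2 for gk, x2 in zip(g, noisy)]
        all_gains.append(g)
    return all_gains
```

Each returned gain would multiply the corresponding bin of the noisy short-time spectrum before the inverse transform. The second step relies only on the current frame, which is intended to reduce the one-frame lag of the decision-directed estimate.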

    Contents
    Chinese abstract
    Abstract
    Contents
    List of figures
    List of tables
    CHAPTER 1 Introduction
      1.1 Background and motivation
      1.2 Surveys of relative research
      1.3 Outline of the proposed speech enhancement system
      1.4 Organization of this thesis
    CHAPTER 2 Review of Existing Noise Estimation Algorithms
      2.1 Minimum statistics (MS) noise estimation
      2.2 Minima controlled recursive averaging (MCRA)
      2.3 Improved minima controlled recursive averaging (IMCRA)
      2.4 Noise estimation with rapid adaptation (NERA)
    CHAPTER 3 Review of Speech Enhancement Algorithms
      3.1 Spectral subtraction (SS)
      3.2 Wiener filter
      3.3 Minimum mean-squared error short-time spectral amplitude estimator (MMSE-STSA)
      3.4 Minimum mean-squared error log-spectral amplitude estimator (MMSE-LSA)
      3.5 Optimally-modified log-spectral amplitude estimator (OM-LSA)
    CHAPTER 4 Proposed Speech Enhancement Algorithm
      4.1 Proposed noise estimation algorithm
      4.2 Speech enhancement based on TSNR approach
    CHAPTER 5 Experimental Results and Discussions
      5.1 Description of the experiment
      5.2 Relative noise estimation error and segmental SNR improvement
      5.3 Experimental results and discussions
    CHAPTER 6 Conclusion and Future Work
      6.1 Conclusion
      6.2 Future work
    Reference


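    Full text release date: full text not authorized for public release (campus network)
    Full text release date: full text not authorized for public release (off-campus network)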
