
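Graduate student: Du, Zong-Sian (杜宗憲)
Thesis title: A Study on Noise Reduction and Voiced Speech Signal Reconstruction (雜訊刪減與有聲語音訊號重建之研究)
Advisor: Wang, Hsiao-Chuan (王小川)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2009
Academic year of graduation: 98
Language: Chinese
Number of pages: 74
Chinese keywords: 語音增強 (speech enhancement), 弦波模型 (sinusoidal model)
English keywords: speech enhancement, harmonic model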
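  • Many everyday applications, such as mobile phones, hearing aids for the hearing impaired, and automatic speech recognition (ASR) systems, suffer from environmental noise. This interference must be overcome so that listeners hear better-quality speech or so that the system's recognition rate improves. A typical single-channel speech enhancement technique first estimates the noise and then computes a spectral gain function to suppress the noise contained in the speech.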
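    The generic "estimate the noise, then apply a spectral gain" idea can be sketched as follows. This is a minimal illustration, not the thesis's method: the function name `wiener_gain`, the gain floor, the toy power spectrogram, and the assumption that the first ten frames are speech-free are all introduced here for illustration only.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, gain_floor=0.1):
    """Per-bin gain computed from a noise power estimate (Wiener form)."""
    # Crude SNR estimate: a posteriori SNR minus one, clipped at zero.
    snr = np.maximum(noisy_power / np.maximum(noise_power, 1e-12) - 1.0, 0.0)
    gain = snr / (1.0 + snr)
    # A floor limits how strongly any bin is attenuated.
    return np.maximum(gain, gain_floor)

# Toy power spectrogram: rows are frames, columns are frequency bins.
rng = np.random.default_rng(0)
noise = rng.exponential(1.0, size=(50, 8))
speech = np.zeros((50, 8))
speech[10:, 2] = 20.0            # "speech" occupies bin 2 from frame 10 onward
noisy_power = speech + noise     # simplification: powers are assumed additive

# Step 1: estimate the noise from leading frames assumed to be speech-free.
noise_power = noisy_power[:10].mean(axis=0)

# Step 2: compute the spectral gain and apply it to the noisy spectrum.
gain = wiener_gain(noisy_power, noise_power)
enhanced_power = gain**2 * noisy_power
```

    In this toy case the gain stays close to one in the bin that carries speech and falls toward the floor in noise-only bins. A real system would track the noise continuously rather than rely on leading frames.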
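    This thesis proposes a two-stage single-channel speech enhancement method. The first stage is based on short-time spectral amplitude (STSA) estimation. The second stage re-estimates the voiced speech signal with a sinusoidal (harmonic) model, strengthening its harmonic components and suppressing the noise between harmonics. For noise estimation by minima controlled recursive averaging (MCRA), an improved way of detecting speech presence or absence is proposed so that the detection result is more clear-cut, which in turn improves the noise estimate. When the decision-directed approach is used to estimate the a priori SNR, the smoothing constant is made to vary from frame to frame so that it tracks changes in the spectrum. The harmonic-model stage then takes the voiced speech enhanced by the first stage and further strengthens its harmonic components to improve speech quality.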
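    The decision-directed estimate with a frame-dependent smoothing constant might look like the sketch below. The abstract does not state how the constant is adapted, so `frame_varying_alpha` is a hypothetical rule (lower the constant when frame energy changes quickly) chosen only to show the effect. The Wiener gain is used as a stand-in for the STSA gain, and all names and parameter values are assumptions.

```python
import numpy as np

def decision_directed_snr(noisy_power, noise_power, alpha, xi_min=1e-3):
    """A priori SNR per frame and bin using the decision-directed rule.

    noisy_power: (frames, bins) power spectrum of the noisy signal.
    noise_power: (bins,) noise power estimate.
    alpha:       (frames,) smoothing constant for each frame, in [0, 1).
    """
    gamma = noisy_power / np.maximum(noise_power, 1e-12)  # a posteriori SNR
    xi = np.empty_like(gamma)
    prev_clean = np.maximum(gamma[0] - 1.0, 0.0)  # initial clean-speech SNR guess
    for t in range(gamma.shape[0]):
        instantaneous = np.maximum(gamma[t] - 1.0, 0.0)
        xi[t] = np.maximum(
            alpha[t] * prev_clean + (1.0 - alpha[t]) * instantaneous, xi_min)
        gain = xi[t] / (1.0 + xi[t])      # Wiener gain, stand-in for the STSA gain
        prev_clean = gain**2 * gamma[t]   # previous clean estimate over noise power
    return xi

def frame_varying_alpha(noisy_power, alpha_min=0.7, alpha_max=0.98):
    """Hypothetical rule: use a smaller alpha when frame energy changes quickly."""
    energy = noisy_power.sum(axis=1)
    change = np.abs(np.diff(energy, prepend=energy[0])) / np.maximum(energy, 1e-12)
    return np.clip(alpha_max - change, alpha_min, alpha_max)

# Toy input: noise-only frames followed by a sudden speech onset at frame 3.
noisy_power = np.ones((6, 4))
noisy_power[3:] = 11.0
noise_power = np.ones(4)

alpha = frame_varying_alpha(noisy_power)
xi_varying = decision_directed_snr(noisy_power, noise_power, alpha)
xi_fixed = decision_directed_snr(noisy_power, noise_power, np.full(6, 0.98))
```

    At the onset frame the adaptive constant drops to its lower bound, so the a priori SNR estimate follows the spectral change much faster than it does with a fixed constant of 0.98.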
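    The results are evaluated using segmental SNR improvement and the Perceptual Evaluation of Speech Quality (PESQ). The stages of the proposed method are compared with one another. The results show that the two-stage method yields a further improvement in PESQ score, although the improvement is less noticeable in listening.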
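    Segmental SNR improvement can be computed along the following lines. This is a sketch under common conventions, not the thesis's exact settings: the frame length, the clamping range of -10 to 35 dB, and the synthetic signals are assumptions, and the "enhanced" signal is simply the clean signal plus half the noise.

```python
import numpy as np

def segmental_snr(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi]."""
    n_frames = len(clean) // frame_len
    c = clean[:n_frames * frame_len].reshape(n_frames, frame_len)
    p = processed[:n_frames * frame_len].reshape(n_frames, frame_len)
    signal_energy = np.sum(c**2, axis=1)
    error_energy = np.sum((c - p)**2, axis=1)
    snr_db = 10.0 * np.log10(
        np.maximum(signal_energy, 1e-12) / np.maximum(error_energy, 1e-12))
    return float(np.mean(np.clip(snr_db, lo, hi)))

# Synthetic test signals (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = np.arange(4096)
clean = np.sin(2.0 * np.pi * 0.01 * n)
noise = 0.5 * rng.standard_normal(n.size)
noisy = clean + noise
enhanced = clean + 0.5 * noise   # stand-in for an enhancer that halves the noise

improvement = segmental_snr(clean, enhanced) - segmental_snr(clean, noisy)
```

    Halving the noise amplitude raises every frame's SNR by 20·log10(2), about 6 dB, so that is the improvement this example reports. PESQ, by contrast, is defined by an ITU-T recommendation and needs a dedicated implementation.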


    Abstract
    Acknowledgments
    Table of Contents
    List of Figures and Tables
    Chapter 1 Introduction
    1.1 Research Background
    1.2 Research Direction
    1.3 Thesis Outline
    Chapter 2 Overview of Noise Estimation
    2.1 Minimum Statistics (MS)
    2.2 Minima Controlled Recursive Averaging (MCRA)
    2.3 Improved Minima Controlled Recursive Averaging (IMCRA)
    Chapter 3 Overview of Speech Enhancement Algorithms
    3.1 Spectral Subtraction (SS)
    3.2 Wiener Filter
    3.3 Minimum Mean-Square Error Short-Time Spectral Amplitude (MMSE-STSA)
    3.4 Minimum Mean-Square Error Log-Spectral Amplitude (MMSE-LSA)
    3.5 Optimally Modified Log-Spectral Amplitude (OM-LSA)
    3.5.1 A Priori Speech Absence Probability Estimation
    Chapter 4 Improved Speech Spectrum Estimation
    4.1 Per-Band Decision on Speech Presence or Absence
    4.2 A Priori SNR Estimation
    4.3 Second-Stage Speech Processing
    Chapter 5 Experimental Results and Discussion
    5.1 Introduction to the Experiments
    5.2 Evaluation Methods
    5.3 Experimental Results and Discussion
    Chapter 6 Conclusions and Future Work
    6.1 Conclusions
    6.2 Future Work
    References


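    Full-text release date: full text not authorized for public release (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)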
